Examining the Relationship Between Demographics, Admission Types, and Clinical Outcomes in New York Inpatients (Niagara County)

1. Introduction

Due: October 22nd (Wednesday, End of Day)

This R Markdown document presents an exploratory data analysis (EDA) of the dataset Hospital_Inpatient_Discharges.csv. It includes the code, explanations, a dataset summary, descriptive statistics, graphical displays, variance and standard deviation measures, normality checks, and initial statistical tests.

2. Load dataset

## Dataset dimensions: Rows: 21075 Columns: 34
## Rows: 21,075
## Columns: 34
## $ Health.Service.Area                 <chr> "Western NY", "Western NY", "Weste…
## $ Hospital.County                     <chr> "Niagara", "Niagara", "Niagara", "…
## $ Operating.Certificate.Number        <int> 3101000, 3101000, 3101000, 3101000…
## $ Facility.ID                         <int> 565, 565, 565, 565, 565, 565, 565,…
## $ Facility.Name                       <chr> "Eastern Niagara Hospital - Lockpo…
## $ Age.Group                           <chr> "18 to 29", "18 to 29", "0 to 17",…
## $ Zip.Code...3.digits                 <chr> "140", "140", "140", "140", "140",…
## $ Gender                              <chr> "M", "F", "F", "M", "F", "F", "M",…
## $ Race                                <chr> "White", "White", "White", "White"…
## $ Ethnicity                           <chr> "Not Span/Hispanic", "Not Span/His…
## $ Length.of.Stay                      <chr> "1", "2", "2", "5", "2", "2", "2",…
## $ Type.of.Admission                   <chr> "Emergency", "Emergency", "Newborn…
## $ Patient.Disposition                 <chr> "Home or Self Care", "Home or Self…
## $ Discharge.Year                      <int> 2012, 2012, 2012, 2012, 2012, 2012…
## $ CCS.Diagnosis.Code                  <int> 7, 190, 218, 120, 195, 188, 130, 2…
## $ CCS.Diagnosis.Description           <chr> "Viral infection", "Fetal distress…
## $ CCS.Procedure.Code                  <int> 4, 137, 228, 76, 137, 134, 39, 115…
## $ CCS.Procedure.Description           <chr> "DIAGNOSTIC SPINAL TAP", "OT PRCS …
## $ APR.DRG.Code                        <int> 723, 560, 640, 254, 560, 540, 194,…
## $ APR.DRG.Description                 <chr> "Viral illness", "Vaginal delivery…
## $ APR.MDC.Code                        <int> 18, 14, 15, 6, 14, 14, 5, 15, 6, 1…
## $ APR.MDC.Description                 <chr> "Infectious and Parasitic Diseases…
## $ APR.Severity.of.Illness.Code        <int> 1, 1, 1, 1, 2, 1, 2, 1, 1, 2, 1, 2…
## $ APR.Severity.of.Illness.Description <chr> "Minor", "Minor", "Minor", "Minor"…
## $ APR.Risk.of.Mortality               <chr> "Minor", "Minor", "Minor", "Modera…
## $ APR.Medical.Surgical.Description    <chr> "Medical", "Medical", "Medical", "…
## $ Payment.Typology.1                  <chr> "Self-Pay", "Medicaid", "Medicaid"…
## $ Payment.Typology.2                  <chr> "", "", "", "Medicare", "", "", ""…
## $ Payment.Typology.3                  <chr> "", "", "", "", "", "", "", "", ""…
## $ Birth.Weight                        <int> 0, 0, 3100, 0, 0, 0, 0, 2900, 0, 0…
## $ Abortion.Edit.Indicator             <chr> "N", "N", "N", "N", "N", "N", "N",…
## $ Emergency.Department.Indicator      <chr> "Y", "N", "N", "Y", "N", "N", "N",…
## $ Total.Charges                       <dbl> 4334.14, 2076.62, 1638.00, 9927.27…
## $ Total.Costs                         <dbl> 1560.46, 2221.43, 1982.04, 4933.30…
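The structure listing above is consistent with reading the CSV and printing its layout. The following is a minimal sketch, assuming the file was read with base `read.csv()` (which would explain dotted column names such as `Zip.Code...3.digits`). The helper name `load_discharges` is hypothetical, and a tiny temporary file stands in for the real data so the block runs on its own.

```r
# Hypothetical helper: read the discharge file and report its dimensions.
load_discharges <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  cat("Dataset dimensions: Rows:", nrow(df), "Columns:", ncol(df), "\n")
  df
}

# Stand-in file so the sketch is runnable without the real CSV.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Hospital County,Zip Code - 3 digits,Length of Stay,Total Charges",
  "Niagara,140,1,4334.14",
  "Niagara,140,2,2076.62"
), tmp)

demo <- load_discharges(tmp)
str(demo)  # dplyr::glimpse(demo) prints a similar column-per-line layout, if dplyr is installed
```

With the real file, the call would presumably be `load_discharges("Hospital_Inpatient_Discharges.csv")`, with the path adjusted to wherever the CSV is stored.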

3. Handle missing values

Here I count the missing values in each column to check whether the dataset has any.

## # A tibble: 34 × 2
##    `column(variable)`           n_missing
##    <chr>                            <int>
##  1 Health.Service.Area                  0
##  2 Hospital.County                      0
##  3 Operating.Certificate.Number         0
##  4 Facility.ID                          0
##  5 Facility.Name                        0
##  6 Age.Group                            0
##  7 Zip.Code...3.digits                  0
##  8 Gender                               0
##  9 Race                                 0
## 10 Ethnicity                            0
## # ℹ 24 more rows

## No missing values detected. Dataset is clean.
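A count based on `is.na()` does not flag empty strings, and the structure listing in Section 2 shows blank values (`""`) in `Payment.Typology.2` and `Payment.Typology.3`. The sketch below counts both kinds per column. It is an illustration under assumptions: the helper name `count_missing` is hypothetical, and the toy data frame only mimics a few columns of the real file.

```r
# Hypothetical helper: count NA values and blank strings per column.
count_missing <- function(df) {
  data.frame(
    column  = names(df),
    n_na    = vapply(df, function(x) sum(is.na(x)), integer(1)),
    n_blank = vapply(df, function(x) {
      if (is.character(x)) sum(!is.na(x) & trimws(x) == "") else 0L
    }, integer(1)),
    row.names = NULL
  )
}

# Toy stand-in data, not the SPARCS records.
toy <- data.frame(
  Payment.Typology.1 = c("Self-Pay", "Medicaid", "Medicaid"),
  Payment.Typology.2 = c("", "", "Medicare"),
  Total.Costs        = c(1560.46, 2221.43, NA),
  stringsAsFactors   = FALSE
)
print(count_missing(toy))
```

Whether a blank should be treated as missing depends on the column: an empty secondary payer may simply mean there was none.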

4. Remove Duplicates

Here I check for duplicate rows and remove them:

## Number of duplicate rows: 4
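Removing the 4 duplicates from 21,075 rows leaves 21,071, which matches the row count in the summaries that follow. A minimal base-R sketch of this step on toy data (the real code may have used a different function, such as `dplyr::distinct()`):

```r
# Toy stand-in data: row 4 repeats row 1.
toy <- data.frame(
  Facility.ID      = c(565, 565, 574, 565),
  Length.of.Stay   = c("1", "2", "2", "1"),
  stringsAsFactors = FALSE
)

n_dup <- sum(duplicated(toy))  # rows identical to an earlier row
cat("Number of duplicate rows:", n_dup, "\n")

deduped <- toy[!duplicated(toy), , drop = FALSE]  # keep the first occurrence
cat("Rows after removal:", nrow(deduped), "\n")
```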

5. Descriptive statistics

## Numeric Columns summary:
Data summary

Name               num_vars
Number of rows     21071
Number of columns  11
Column type frequency:
  numeric          11
Group variables    None

Variable type: numeric

skim_variable                 n_missing  complete_rate        mean         sd          p0         p25         p50         p75       p100  hist
Operating.Certificate.Number          0              1  2880548.75  579505.37  1401014.00  3101000.00  3102000.00  3121001.00  3121001.0  ▁▁▁▁▇
Facility.ID                           0              1      576.23       7.07      565.00      574.00      574.00      583.00      585.0  ▅▁▇▃▇
Discharge.Year                        0              1     2012.00       0.00     2012.00     2012.00     2012.00     2012.00     2012.0  ▁▁▇▁▁
CCS.Diagnosis.Code                    0              1      218.95     191.22        2.00      108.00      153.00      218.00      670.0  ▇▇▁▁▂
CCS.Procedure.Code                    0              1       96.46      88.59        0.00        0.00       86.00      202.00      231.0  ▇▂▂▂▆
APR.DRG.Code                          0              1      406.24     245.63        4.00      198.00      321.00      640.00      952.0  ▅▇▃▅▂
APR.MDC.Code                          0              1       10.13       6.11        1.00        5.00        8.00       15.00       25.0  ▇▇▅▅▁
APR.Severity.of.Illness.Code          0              1        1.97       0.85        1.00        1.00        2.00        2.00        4.0  ▆▇▁▃▁
Birth.Weight                          0              1      174.99     739.64        0.00        0.00        0.00        0.00     4900.0  ▇▁▁▁▁
Total.Charges                         0              1    12421.15   13878.88      423.98     5084.56     8267.16    14651.49   294515.6  ▇▁▁▁▁
Total.Costs                           0              1     6607.40    7378.03      103.31     2634.34     4378.96     7826.79   187205.7  ▇▁▁▁▁
## Categorical Columns Summary (first 5 columns):
## Column: Health.Service.Area 
## Western NY 
##      21071 
## 
## Column: Hospital.County 
## Niagara 
##   21071 
## 
## Column: Facility.Name 
## 
##        Niagara Falls Memorial Medical Center 
##                                         6487 
##    Mount St Marys Hospital and Health Center 
##                                         5587 
## Eastern Niagara Hospital - Lockport Division 
##                                         4553 
##                    Degraff Memorial Hospital 
##                                         2802 
##  Eastern Niagara Hospital - Newfane Division 
##                                         1642 
## 
## Column: Age.Group 
## 
## 70 or Older    50 to 69    30 to 49    18 to 29     0 to 17 
##        7523        5974        3852        2369        1353 
## 
## Column: Zip.Code...3.digits 
## 
##  143  140  141  142       OOS 
## 8077 6436 5261  870  189  160
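Frequency tables like those above can be produced by tabulating each character column and sorting by count. The sketch below runs on toy data and is only one way the output could have been generated. In the ZIP code table above, the count with no label (189) appears to belong to an empty-string ZIP prefix, and the six counts shown sum to 20,993, so the display was presumably limited to the most frequent values.

```r
# Toy stand-in data, not the SPARCS records.
toy <- data.frame(
  Age.Group        = c("70 or Older", "50 to 69", "70 or Older", "18 to 29"),
  Gender           = c("F", "M", "F", "F"),
  Total.Costs      = c(1560.46, 2221.43, 1982.04, 4933.30),
  stringsAsFactors = FALSE
)

# Character columns only; numeric columns are summarized separately.
cat_cols <- names(toy)[vapply(toy, is.character, logical(1))]

for (col in head(cat_cols, 5)) {
  cat("Column:", col, "\n")
  print(sort(table(toy[[col]]), decreasing = TRUE))
  cat("\n")
}
```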

Emergency vs Elective Admissions — Length of Stay

============================================================

Q4. Are hospital stays significantly longer for elective admissions compared to emergency admissions in Niagara County (SPARCS 2012 data)?

============================================================

## 
## To examine whether elective admissions are associated with longer hospital stays than emergency admissions using SPARCS 2012 Niagara County data.
## 
## Why it matters:
## Hospital efficiency and patient-flow optimization rely on understanding LOS variation.
los  admission_type  age_group    gender
  1  Emergency       18 to 29     M
  2  Emergency       18 to 29     F
  5  Emergency       70 or Older  M
  2  Elective        18 to 29     F
  2  Elective        0 to 17      F
  2  Elective        70 or Older  M
##       los               admission_type        age_group    gender   
##  Min.   :  1.00   Elective     : 4145   0 to 17    : 196   F:11133  
##  1st Qu.:  2.00   Emergency    :15083   18 to 29   :2161   M: 8095  
##  Median :  3.00   Newborn      :    0   30 to 49   :3662            
##  Mean   :  5.41   Not Available:    0   50 to 69   :5826            
##  3rd Qu.:  6.00   Urgent       :    0   70 or Older:7383            
##  Max.   :112.00
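The summary above covers 19,228 rows restricted to Elective and Emergency admissions, with `los` as a number even though `Length.of.Stay` was read as character. The sketch below shows one way such an analysis frame could be built. The source column names come from the structure listing and `dfq` is the name used in the later `lm()` call, but the exact steps are an assumption, and the toy rows are not SPARCS records. Unlike the summary above, this version keeps only the observed factor levels.

```r
# Toy stand-in data with the relevant columns.
toy <- data.frame(
  Length.of.Stay    = c("1", "2", "5", "2", "3"),
  Type.of.Admission = c("Emergency", "Emergency", "Elective", "Newborn", "Urgent"),
  Age.Group         = c("18 to 29", "18 to 29", "70 or Older", "0 to 17", "50 to 69"),
  Gender            = c("M", "F", "M", "F", "F"),
  stringsAsFactors  = FALSE
)

keep <- toy$Type.of.Admission %in% c("Elective", "Emergency")

dfq <- data.frame(
  # Non-numeric LOS entries, if any, become NA here.
  los            = suppressWarnings(as.numeric(toy$Length.of.Stay[keep])),
  admission_type = factor(toy$Type.of.Admission[keep]),  # only observed levels
  age_group      = factor(toy$Age.Group[keep]),
  gender         = factor(toy$Gender[keep])
)
dfq <- dfq[!is.na(dfq$los), ]  # drop rows whose LOS could not be parsed
summary(dfq)
```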

## Levene's Test for Homogeneity of Variance (center = median)
##          Df F value    Pr(>F)    
## group     1  246.15 < 2.2e-16 ***
##       19226                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
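Levene's test with `center = median` (the Brown-Forsythe variant) is a one-way ANOVA on each observation's absolute deviation from its group median. The sketch below reproduces that computation in base R on simulated data, so its numbers will not match the output above; the output format above resembles `car::leveneTest()`, which is an assumption about the original code.

```r
set.seed(1)
# Simulated stand-in data: two groups with different spread.
sim <- data.frame(
  los   = c(rexp(200, rate = 1 / 7), rexp(800, rate = 1 / 5)),
  group = factor(rep(c("Elective", "Emergency"), times = c(200, 800)))
)

# Absolute deviation of each observation from its group median.
sim$abs_dev <- abs(sim$los - ave(sim$los, sim$group, FUN = median))

# One-way ANOVA on the deviations gives the median-centered Levene F test.
levene_tab <- anova(lm(abs_dev ~ group, data = sim))
print(levene_tab)

# If the car package is installed, this call should give the same F value:
# car::leveneTest(los ~ group, data = sim, center = median)
```

The degrees of freedom follow the same pattern as in the output above: 1 for the two-level grouping factor and n − 2 for the residual (19,226 for 19,228 rows). A significant result is the usual reason to prefer Welch's t-test over the equal-variance version.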

## [1] Emergency Elective 
## Levels: Elective Emergency
## 
##  Welch Two Sample t-test
## 
## data:  los by admission_type
## t = 12.229, df = 5290.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means between group Elective and group Emergency is not equal to 0
## 95 percent confidence interval:
##  1.405815 1.942575
## sample estimates:
##  mean in group Elective mean in group Emergency 
##                6.723522                5.049327
## Elective mean LOS: 6.72 days
## Emergency mean LOS: 5.05 days
## Difference in means: 1.67 days
## Elective stays are 33.2 % longer than Emergency stays (equivalently, Emergency stays are 24.9 % shorter than Elective stays).
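A relative difference depends on which group is the baseline. Using the two group means reported by the Welch test, the 1.67-day gap is about 33.2% of the Emergency mean and about 24.9% of the Elective mean. The arithmetic can be checked with a short sketch (variable names are illustrative):

```r
# Group means as reported in the Welch t-test output.
mean_elective  <- 6.723522
mean_emergency <- 5.049327

diff_days   <- mean_elective - mean_emergency
pct_longer  <- 100 * diff_days / mean_emergency  # Elective relative to Emergency
pct_shorter <- 100 * diff_days / mean_elective   # Emergency relative to Elective

cat(sprintf("Difference in means: %.2f days\n", diff_days))
cat(sprintf("Elective stays are %.1f%% longer than Emergency stays\n", pct_longer))
cat(sprintf("Emergency stays are %.1f%% shorter than Elective stays\n", pct_shorter))

# The test itself would come from a call along these lines (Welch is the default):
# t.test(los ~ admission_type, data = dfq)
```

The second percentage is the complement of the Emergency/Elective mean ratio, 1 − 0.751.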
## Cohen's d |       95% CI
## ------------------------
## 0.26      | [0.23, 0.30]
## 
## - Estimated using pooled SD.
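Cohen's d with a pooled SD divides the difference in group means by the square root of the weighted average of the two group variances. The sketch below implements that definition directly on toy numbers. The helper name `cohens_d_pooled` is hypothetical; the table format above resembles output from the effectsize package, which is an assumption about the original code.

```r
# Hypothetical helper: Cohen's d using the pooled standard deviation.
cohens_d_pooled <- function(x, y) {
  nx <- length(x)
  ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

x <- c(2, 4, 6, 8)  # toy group A
y <- c(1, 3, 5, 7)  # toy group B
d <- cohens_d_pooled(x, y)
print(d)
```

As a rough consistency check on the reported value, a 1.67-day difference with d = 0.26 implies a pooled SD of roughly 6.4 days.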
## 
##  Welch Two Sample t-test
## 
## data:  log_los by admission_type
## t = 8.2019, df = 5730.7, p-value = 2.894e-16
## alternative hypothesis: true difference in means between group Elective and group Emergency is not equal to 0
## 95 percent confidence interval:
##  0.08261313 0.13450816
## sample estimates:
##  mean in group Elective mean in group Emergency 
##                1.673387                1.564826
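A difference in mean log values back-transforms to a ratio of geometric means. The sketch below applies `exp()` to the reported estimates. It assumes `log_los` is the natural log of `los` with no offset; the source does not show the transformation, and if `log1p()` was used the ratio would refer to `los + 1` instead.

```r
# Group means and confidence limits as reported for the log-scale test.
mean_log_elective  <- 1.673387
mean_log_emergency <- 1.564826

log_diff <- mean_log_elective - mean_log_emergency
ratio    <- exp(log_diff)                     # ratio of geometric means
ci_ratio <- exp(c(0.08261313, 0.13450816))    # back-transformed 95% CI

cat(sprintf("Geometric-mean ratio (Elective/Emergency): %.3f\n", ratio))
cat(sprintf("95%% CI: [%.3f, %.3f]\n", ci_ratio[1], ci_ratio[2]))
```

Under that assumption the typical elective stay is roughly 11% longer on the multiplicative scale, a smaller gap than the 33% difference in arithmetic means, which is more sensitive to very long stays.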
## 
## Call:
## lm(formula = los ~ admission_type + age_group + gender, data = dfq)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
##  -6.615  -3.411  -1.846   0.994 106.994 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              5.61938    0.46468  12.093  < 2e-16 ***
## admission_typeEmergency -1.89652    0.11518 -16.466  < 2e-16 ***
## age_group18 to 29        0.41086    0.47567   0.864   0.3877    
## age_group30 to 49        0.79208    0.46696   1.696   0.0899 .  
## age_group50 to 69        0.95356    0.46249   2.062   0.0392 *  
## age_group70 or Older     1.28267    0.46099   2.782   0.0054 ** 
## genderM                  0.71278    0.09367   7.609 2.89e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.368 on 19221 degrees of freedom
## Multiple R-squared:  0.01648,    Adjusted R-squared:  0.01617 
## F-statistic: 53.67 on 6 and 19221 DF,  p-value: < 2.2e-16
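The coefficients can be read as adjustments to a reference profile (elective admission, age 0 to 17, female), whose predicted LOS is the intercept. The sketch below adds up the reported estimates for a chosen profile; the helper name `predict_los` is hypothetical and the values are copied from the table above.

```r
# Coefficient estimates copied from the regression output.
coefs <- c(
  "(Intercept)"             = 5.61938,
  "admission_typeEmergency" = -1.89652,
  "age_group18 to 29"       = 0.41086,
  "age_group30 to 49"       = 0.79208,
  "age_group50 to 69"       = 0.95356,
  "age_group70 or Older"    = 1.28267,
  "genderM"                 = 0.71278
)

# Hypothetical helper: sum the intercept and the terms that apply to a profile.
predict_los <- function(terms) unname(sum(coefs[c("(Intercept)", terms)]))

# Reference profile: elective, age 0 to 17, female (intercept only).
ref_los <- predict_los(character(0))

# Emergency admission, male, 70 or older.
em_los <- predict_los(c("admission_typeEmergency", "age_group70 or Older", "genderM"))

cat(sprintf("Reference profile: %.2f days\n", ref_los))
cat(sprintf("Emergency, male, 70 or older: %.2f days\n", em_los))
```

With an R-squared of 0.016, these predictors explain under 2% of the variance in LOS, so such predictions describe group averages rather than individual stays.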
## 
## • Emergency patients stay significantly SHORTER than elective patients.
## • Observed mean ratio (Emergency/Elective) ≈  0.751 .
## • Welch t-test: p = 6.2e-34.
## • Effect size is small (Cohen's d = 0.26, 95% CI [0.23, 0.30]).
## • Implication: elective (often surgical) cases drive longer LOS; plan capacity accordingly.

Statistical Interpretation:

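The results show that elective admissions have significantly longer hospital stays than emergency admissions in Niagara County in the SPARCS 2012 data. The null hypothesis was that mean length of stay (LOS) is the same for the two admission types; the alternative, as tested, was two-sided (the means differ). Because Levene's test indicated unequal variances (F = 246.15, p < 0.001), Welch's t-test was used. It rejects the null hypothesis: elective patients stayed about 1.7 days longer on average (6.72 vs 5.05 days; 95% CI 1.41 to 1.94 days; p < 0.001). The conclusion holds on the log scale (p < 0.001), where the right skew of LOS has less influence, and after adjusting for age group and gender in the linear model (emergency stays about 1.9 days shorter). The effect size is small (Cohen's d = 0.26), and the regression explains under 2% of the variance in LOS (R-squared = 0.016), so the difference, although statistically significant, is modest in practical terms. One plausible explanation is that elective cases often involve planned surgeries and recovery periods, but this analysis does not test that mechanism.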